Enabling text readability awareness during the micro planning phase of NLG applications
Authors
Abstract
Currently, NLG systems lack awareness of text complexity. Much attention has been given to text simplification. However, the results of an experiment showed that sophisticated readers would in fact rather read more sophisticated text than the simplest text available. We therefore propose a technique that considers different readability levels during the micro planning phase of an NLG system. Our technique considers grammatical and syntactic choices, as well as lexical items, when generating text. The application uses the domain of descriptive summaries of line graphs as its use case. The proposed technique uses learning to identify features of text complexity; a graph search algorithm for efficient aggregation given a target reading level; and a combination of language modeling and word vectors to create a domain-aware synset, which in turn allows the creation of a disambiguated lexicon appropriate to different reading levels. We found that readers with varying reading abilities do indeed prefer text generated at different target reading levels. To the best of our knowledge, this is the first time readability awareness has been considered in the micro planning phase of NLG systems.
Similar resources
Generating basic skills reports for low-skilled readers
We describe SkillSum, a Natural Language Generation (NLG) system that generates a personalised feedback report for someone who has just completed a screening assessment of their basic literacy and numeracy skills. Because many SkillSum users have limited literacy, the generated reports must be easily comprehended by people with limited reading skills; this is the most novel aspect of SkillSum, ...
Recent Advances in Natural Language Generation: A Survey and Classification of the Empirical Literature
Natural Language Generation (NLG) is defined as the systematic approach for producing human understandable natural language text based on nontextual data or from meaning representations. This is a significant area which empowers human-computer interaction. It has also given rise to a variety of theoretical as well as empirical approaches. This paper intends to provide a detailed overview and a ...
Creating Training Corpora for NLG Micro-Planning
In this paper, we present a novel framework for semi-automatically creating linguistically challenging microplanning data-to-text corpora from existing Knowledge Bases. Because our method pairs data of varying size and shape with texts ranging from simple clauses to short texts, a dataset created using this framework provides a challenging benchmark for microplanning. Another feature of this fr...
Efficient algorithm for Context Sensitive Aggregation in Natural Language generation
Aggregation is a sub-task of Natural Language Generation (NLG) that improves the conciseness and readability of the text outputted by NLG systems. Till date, approaches towards the aggregation task have been predominantly manual (manual analysis of domain specific corpus and development of rules). In this paper, a new algorithm for aggregation in NLG is proposed, that learns context sensitive a...
OPTIMAL LOT-SIZING DECISIONS WITH INTEGRATED PURCHASING, MANUFACTURING AND ASSEMBLING FOR REMANUFACTURING SYSTEMS
This work applies fuzzy sets to the integration of purchasing, manufacturing and assembling of production planning decisions with multiple suppliers, multiple components and multiple machines in remanufacturing systems. The developed fuzzy multi-objective linear programming model (FMOLP) simultaneously minimizes total costs, total $\text{CO}_2$ emissions and total lead time with reference to cus...